The xtabs() function creates contingency tables, including from data in frequency-weighted format, where one column holds the count for each combination of levels. Use xtabs() when you want to numerically study the distribution of one categorical variable, or the relationship between two categorical variables. Categorical variables are stored as "factor" variables in R.
Using a formula interface, xtabs() creates a contingency table (optionally as a sparse matrix) by cross-classifying factors, usually contained in a data frame.
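As a quick illustration of the formula interface, the sketch below uses R's built-in mtcars and Titanic datasets rather than the exercise data, so it does not give away any answers. The object names (tab_counts, titanic_df, tab_weighted) are only illustrative. With nothing to the left of the tilde, xtabs() counts rows; with a count column on the left, it sums that column, which is the frequency-weighted case.

```r
# One-sided formula: count the rows of mtcars in each cyl x gear cell.
tab_counts <- xtabs(~ cyl + gear, data = mtcars)
print(tab_counts)

# Frequency-weighted input: one row per cell, with a count column (Freq).
titanic_df <- as.data.frame(Titanic)

# Two-sided formula: sum Freq within each Class x Survived cell.
tab_weighted <- xtabs(Freq ~ Class + Survived, data = titanic_df)
print(tab_weighted)
```

In both cases the result is an object of class "xtabs" and "table", so the usual table tools (print, summary, plot, ftable) apply to it.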
Answers to the exercises are given at the end of this exercise set.
Exercise 1
xtabs() with One Categorical Variable
Create the following data frame, which all of the exercises use:
Data1 <- data.frame(
  Reference = c("KRXH", "KRPT", "FHRA", "CZKK", "CQTN", "PZXW", "SZRZ", "RMZE", "STNX", "TMDW"),
  Status = c("Accepted", "Accepted", "Rejected", "Accepted", "Rejected", "Accepted", "Rejected", "Rejected", "Accepted", "Accepted"),
  Gender = c("Female", "Male", "Male", "Female", "Female", "Female", "Male", "Female", "Female", "Female"),
  Test = c("Test1", "Test1", "Test2", "Test3", "Test1", "Test4", "Test4", "Test2", "Test3", "Test1"),
  NewOrFollowUp = c("New", "New", "New", "New", "New", "Follow-up", "New", "New", "New", "New"),
  # Keep the columns as factors on R >= 4.0.0, where this default became FALSE;
  # Exercise 9 relies on the factor level codes.
  stringsAsFactors = TRUE
)
The xtabs() function can display the frequency, or count, of the levels of categorical variables. For the first exercise, use the xtabs() function to find the count of each level of the variable "Status" in the data frame "Data1" defined above.
Exercise 2
Two Categorical Variables – Discovering relationships within a dataset
Next, use the xtabs() function with two variables from "Data1" to create a table showing the relationship between the "Reference" category and the "Status" category.
Exercise 3
Three Categorical Variables – Creating a Multi-Dimensional Table
Apply three variables from "Data1" to create a multi-dimensional cross-tabulation of "Status", "Gender", and "Test".
Exercise 4
Creating Two-Dimensional Tables from Multi-Dimensional Cross-Tabulations
Enclose the xtabs() formula from Exercise 3 within the "ftable()" function to display the multi-dimensional cross-tabulation in two dimensions.
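To see what ftable() changes, here is a small sketch on the built-in mtcars data (not the exercise data). The names tab3 and flat are illustrative only.

```r
# Three-way table of counts from the built-in mtcars data.
tab3 <- xtabs(~ cyl + gear + am, data = mtcars)

# Printing tab3 directly shows one cyl x gear slice per value of am.
print(tab3)

# ftable() flattens the same counts into a single two-dimensional layout.
# By default the last variable (am) goes across the columns and the
# remaining variables are nested down the rows.
flat <- ftable(tab3)
print(flat)
```

The counts are identical in both printouts; only the layout differs. The row.vars and col.vars arguments of ftable() can be used to choose a different arrangement.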
Exercise 5
Row Percentages
The R package "tigerstats" is required for the next two exercises.
if (!require(tigerstats)) {
  install.packages("tigerstats")
  library(tigerstats)
}
1) Create an xtabs() formula that cross-tabulates "Status" and "Test".
2) Enclose the xtabs() formula in the tigerstats function "rowPerc()" to display row percentages for "Status" by "Test".
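If tigerstats is not available, similar percentages can be computed in base R with prop.table(). The sketch below is an assumption about an equivalent calculation, not a description of how rowPerc() or colPerc() are implemented, and it does not add the "Total" row or column that those functions print. It uses the built-in mtcars data; tab, row_pct, and col_pct are illustrative names.

```r
tab <- xtabs(~ cyl + gear, data = mtcars)

# margin = 1: percentages within each row (each row sums to 100).
row_pct <- round(100 * prop.table(tab, margin = 1), 2)
print(row_pct)

# margin = 2: percentages within each column (each column sums to 100).
col_pct <- round(100 * prop.table(tab, margin = 2), 2)
print(col_pct)
```

Row percentages answer "of the cars with this many cylinders, what share has each gear count?", while column percentages condition on the column variable instead.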
Exercise 6
Column Percentages
1) Create an xtabs() formula that cross-tabulates "Reference" and "Status".
2) Use "colPerc()" to display column percentages for "Reference" by "Status".
Exercise 7
Plotting Cross-Tabulations
Use the "plot()" function and the "xtabs()" function to plot "Status" by "Gender".
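For orientation, calling plot() on a two-way table draws a mosaic plot, in which tile areas are proportional to the cell counts. A minimal sketch on the built-in mtcars data (the name tab and the plot title are illustrative):

```r
tab <- xtabs(~ cyl + am, data = mtcars)

# plot() dispatches to the table method, which draws a mosaic plot
# for tables with two or more dimensions.
plot(tab, main = "Cylinders by transmission (mtcars)")

# The same kind of figure can be requested explicitly.
mosaicplot(tab, main = "Cylinders by transmission (mtcars)")
```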
Exercise 8
xtabs() – Explanatory and Response Variables
In order to examine whether the explanatory variable "Gender" affects the response variable "Status", create a two-factor xtabs() formula with the response variable as the first term and the explanatory variable as the second term.
Exercise 9
Using cbind() with xtabs()
Using the "cbind()" function on the left-hand side of an xtabs() formula defines the columns of a flat table of your dataset. The variable after ~ (tilde) will display as the row data. For example, ftable(xtabs(cbind(variable1, variable2) ~ variable3, data=" ")). Note that cbind() converts factors to their integer level codes, so the cells show sums of those codes rather than counts.
For this exercise, create a flat table with columns for "Gender" and "Test". The row variable is "Reference".
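The cbind() form is easiest to interpret when the left-hand variables are numeric counts. The sketch below uses the built-in esoph data, whose ncases and ncontrols columns are counts; tab_cb and cases_by_age are illustrative names.

```r
# Sum two numeric columns within each age group.
tab_cb <- xtabs(cbind(ncases, ncontrols) ~ agegp, data = esoph)
print(ftable(tab_cb))

# Each column of tab_cb is the per-group sum of the corresponding variable,
# which can be checked against tapply().
cases_by_age <- tapply(esoph$ncases, esoph$agegp, sum)
print(cases_by_age)
```

With factor inputs, as in this exercise, the same summation is applied to the integer level codes, which is why the resulting cells are codes such as 1 and 2 rather than frequencies.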
Exercise 10
Testing for Independence with xtabs()
When passed to the "summary()" function, an xtabs() table reports a chi-squared test for independence of its factors. Use summary() and xtabs() to test whether "Status" is independent of "Gender".
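To make the connection to the chi-squared test concrete, the sketch below compares summary() on a table with chisq.test() on the same table, using the built-in mtcars data. The names tab, s, and ct are illustrative, and correct = FALSE is passed so that no continuity correction is applied.

```r
tab <- xtabs(~ cyl + am, data = mtcars)

# summary() on a table reports a chi-squared test of independence.
s <- summary(tab)
print(s)

# The same Pearson statistic from chisq.test(), without continuity correction.
# suppressWarnings() hides the small-expected-count warning for this data.
ct <- suppressWarnings(chisq.test(tab, correct = FALSE))
print(c(statistic = unname(ct$statistic), p.value = ct$p.value))
```

When expected cell counts are small, as they are with only ten rows in Data1, summary() prints "Chi-squared approximation may be incorrect", so the p-value should be read with caution.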
# Answer to Exercise 1
Data1 <- data.frame(
  Reference = c("KRXH", "KRPT", "FHRA", "CZKK", "CQTN", "PZXW", "SZRZ", "RMZE", "STNX", "TMDW"),
  Status = c("Accepted", "Accepted", "Rejected", "Accepted", "Rejected", "Accepted", "Rejected", "Rejected", "Accepted", "Accepted"),
  Gender = c("Female", "Male", "Male", "Female", "Female", "Female", "Male", "Female", "Female", "Female"),
  Test = c("Test1", "Test1", "Test2", "Test3", "Test1", "Test4", "Test4", "Test2", "Test3", "Test1"),
  NewOrFollowUp = c("New", "New", "New", "New", "New", "Follow-up", "New", "New", "New", "New"),
  # Keep the columns as factors on R >= 4.0.0; Exercise 9 relies on the level codes.
  stringsAsFactors = TRUE
)
xtabs(~Status, data=Data1)
## Status
## Accepted Rejected
##        6        4

# Answer to Exercise 2
xtabs(~Reference + Status, data=Data1)
##          Status
## Reference Accepted Rejected
##   CQTN           0        1
##   CZKK           1        0
##   FHRA           0        1
##   KRPT           1        0
##   KRXH           1        0
##   PZXW           1        0
##   RMZE           0        1
##   STNX           1        0
##   SZRZ           0        1
##   TMDW           1        0

# Answer to Exercise 3
xtabs(~Status + Gender + Test, data=Data1)
## , , Test = Test1
##
##           Gender
## Status     Female Male
##   Accepted      2    1
##   Rejected      1    0
##
## , , Test = Test2
##
##           Gender
## Status     Female Male
##   Accepted      0    0
##   Rejected      1    1
##
## , , Test = Test3
##
##           Gender
## Status     Female Male
##   Accepted      2    0
##   Rejected      0    0
##
## , , Test = Test4
##
##           Gender
## Status     Female Male
##   Accepted      1    0
##   Rejected      0    1

# Answer to Exercise 4
ftable(xtabs(~Status + Gender + Test, data=Data1))
##                 Test Test1 Test2 Test3 Test4
## Status   Gender
## Accepted Female          2     0     2     1
##          Male            1     0     0     0
## Rejected Female          1     1     0     0
##          Male            0     1     0     1

# Answer to Exercise 5
if (!require(tigerstats)) {
  install.packages("tigerstats")
  library(tigerstats)
}
rowPerc(xtabs(~Status + Test, data=Data1))
##           Test
## Status      Test1  Test2  Test3  Test4  Total
##   Accepted  50.00   0.00  33.33  16.67 100.00
##   Rejected  25.00  50.00   0.00  25.00 100.00

# Answer to Exercise 6
colPerc(xtabs(~Reference + Status, data=Data1))
##          Status
## Reference Accepted Rejected
##   CQTN        0.00       25
##   CZKK       16.67        0
##   FHRA        0.00       25
##   KRPT       16.67        0
##   KRXH       16.67        0
##   PZXW       16.67        0
##   RMZE        0.00       25
##   STNX       16.67        0
##   SZRZ        0.00       25
##   TMDW       16.67        0
##   Total     100.00      100

# Answer to Exercise 7
plot(xtabs(~Status + Gender, data=Data1))

# Answer to Exercise 8
xtabs(~Status + Gender, data=Data1)
##           Gender
## Status     Female Male
##   Accepted      5    1
##   Rejected      2    2

# Answer to Exercise 9
ftable(xtabs(cbind(Gender, Test) ~ Reference, data=Data1))
##           Gender Test
## Reference
## CQTN           1    1
## CZKK           1    3
## FHRA           2    2
## KRPT           2    1
## KRXH           1    1
## PZXW           1    4
## RMZE           1    2
## STNX           1    3
## SZRZ           2    4
## TMDW           1    1

# Answer to Exercise 10
summary(xtabs(~Gender + Status, data=Data1))
## Call: xtabs(formula = ~Gender + Status, data = Data1)
## Number of cases in table: 10
## Number of factors: 2
## Test for independence of all factors:
##  Chisq = 1.2698, df = 1, p-value = 0.2598
##  Chi-squared approximation may be incorrect